OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Movie/Script: Alignment and Parsing of Video and Text Transcription

Internal identifier: 000C09 (Main/Exploration); previous: 000C08; next: 000C10

Authors: Timothee Cour [United States]; Chris Jordan [United States]; Eleni Miltsakaki [United States]; Ben Taskar [United States]

RBID : ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74

Abstract

Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales “in the wild”. Harvesting automatically labeled sequences of actions from video would enable creation of large-scale and highly-varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels and shots are reordered into a sequence of long continuous tracks or threads which allow for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model and a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear time is presented. We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure to improve character naming and retrieval of common actions in several episodes of popular TV series.

DOI: 10.1007/978-3-540-88693-8_12


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author>
<name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
</author>
<author>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
</author>
<author>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
</author>
<author>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-88693-8_12</idno>
<idno type="url">https://api.istex.fr/document/4D113318F9911978071D0A7B8FD0031994AF3C74/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001E85</idno>
<idno type="wicri:Area/Istex/Curation">001D55</idno>
<idno type="wicri:Area/Istex/Checkpoint">000676</idno>
<idno type="wicri:doubleKey">0302-9743:2008:Cour T:movie:script:alignment</idno>
<idno type="wicri:Area/Main/Merge">000C21</idno>
<idno type="wicri:Area/Main/Curation">000C09</idno>
<idno type="wicri:Area/Main/Exploration">000C09</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author>
<name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2008</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">4D113318F9911978071D0A7B8FD0031994AF3C74</idno>
<idno type="DOI">10.1007/978-3-540-88693-8_12</idno>
<idno type="ChapterID">12</idno>
<idno type="ChapterID">Chap12</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales “in the wild”. Harvesting automatically labeled sequences of actions from video would enable creation of large-scale and highly-varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels and shots are reordered into a sequence of long continuous tracks or threads which allow for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model and a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear time is presented. We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure to improve character naming and retrieval of common actions in several episodes of popular TV series.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Pennsylvanie</li>
</region>
</list>
<tree>
<country name="États-Unis">
<region name="Pennsylvanie">
<name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
</region>
<name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
</country>
</tree>
</affiliations>
</record>
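
As a rough illustration of how a record like the one above might be read programmatically, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It works on a trimmed copy of the record: the `wicri:`-prefixed attributes and elements are left out on purpose, because the excerpt shown here does not declare that namespace prefix and a strict XML parser would likely reject it as-is. The function name `summarize` and the shape of the returned dictionary are assumptions made for this example; they are not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above. wicri:-prefixed attributes and elements
# are omitted (assumption: the prefix is undeclared in the excerpt, so a
# strict parser would not accept them without a namespace declaration).
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author><name first="Timothee" last="Cour">Timothee Cour</name></author>
<author><name first="Chris" last="Jordan">Chris Jordan</name></author>
<author><name first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name></author>
<author><name first="Ben" last="Taskar">Ben Taskar</name></author>
</titleStmt>
<publicationStmt>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-88693-8_12</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def summarize(record_xml):
    """Return title, authors, year and DOI from a record string.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(record_xml)
    title_stmt = root.find(".//titleStmt")
    pub_stmt = root.find(".//publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        # One <name> element per <author>, in document order.
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": pub_stmt.find("date").get("when"),
        # Select the <idno> whose type attribute is "doi".
        "doi": pub_stmt.findtext("idno[@type='doi']"),
    }


summary = summarize(RECORD)
print(summary)
```

To process the full record rather than the trimmed copy, one option would be to add a namespace declaration for the `wicri` prefix to the root element before parsing; the URI to use is not given on this page, so it is left unspecified here.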

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C09 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000C09 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74
   |texte=   Movie/Script: Alignment and Parsing of Video and Text Transcription
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024